Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add featuregates for volcano capabilities #2989

Merged
merged 1 commit into from
Aug 16, 2023

Conversation

Monokaix
Copy link
Member

@Monokaix Monokaix commented Jul 21, 2023

Releated issue: #3012
Volcano supports many kubernetes native resources and uses these resources to serve the scheduling algorithm, it also provides high-level capabilities such as queues and queue commands. In serverless scenarios, we only focus on core resources such as pods/podgroup, we should disable some resources and capabilitis by feature gate.
So this pr:

  • Categorize basic and advanced capabilities of scheduling and add featuregates for advanced capabilities and set to true by default.
  • User can still use all the volcano capabilities without being aware of the classification, and set the featuregate explicitly to false to disable resources and capabilities that don't need in serveless use case.

@volcano-sh-bot volcano-sh-bot added the size/L Denotes a PR that changes 100-499 lines, ignoring generated files. label Jul 21, 2023
@Monokaix Monokaix force-pushed the featuregate branch 2 times, most recently from 289c921 to eb2c40f Compare July 21, 2023 11:21
@Thor-wl
Copy link
Contributor

Thor-wl commented Jul 24, 2023

@Monokaix Thanks for your contribution. I think I've been aware of your requirements and ideas in some degree. Can you tell more about the meaning of this fearure? In another word, what will happen without the PR? Unacceptable memory usage or poor performance due to unusable cache?

@Thor-wl
Copy link
Contributor

Thor-wl commented Jul 24, 2023

/hold

@volcano-sh-bot volcano-sh-bot added the do-not-merge/hold Indicates that a PR should not merge because someone has issued a /hold command. label Jul 24, 2023
@Monokaix
Copy link
Member Author

Monokaix commented Jul 24, 2023

@Monokaix Thanks for your contribution. I think I've been aware of your requirements and ideas in some degree. Can you tell more about the meaning of this fearure? In another word, what will happen without the PR? Unacceptable memory usage or poor performance due to unusable cache?

We hope that Volcano provides a basic framework, which only includes some core resources of k8s. Without this pr, Volcano will not start successfully if apiserver has no advanced capabilities API registered, which are showed in this pr, because it has to wait for the cache synchronization of informer resources.

@william-wang
Copy link
Member

@Monokaix Thanks for your contribution. I think I've been aware of your requirements and ideas in some degree. Can you tell more about the meaning of this fearure? In another word, what will happen without the PR? Unacceptable memory usage or poor performance due to unusable cache?

We hope that Volcano provides a basic framework, which only includes some core resources of k8s. Without this pr, Volcano will not start successfully if apiserver has no advanced capabilities API registered, which are showed in this pr, because it has to wait for the cache synchronization of informer resources.

Some user builds serverless container platform based on Volcano. The infra platform is a simplied kubernetes that have no object like CSI, priority class and so on. The current Volcano does not play with it. From my side, Volcano can be enhanced to
let user enable or disable the list/watch of CSI, priority class on demand and be more flexable.

@Monokaix suggest to file a enhancement issue and clarify the scenario and associate it with this pr.

@Monokaix Monokaix force-pushed the featuregate branch 3 times, most recently from 1695d41 to a36dc1b Compare August 1, 2023 06:36
@Thor-wl
Copy link
Contributor

Thor-wl commented Aug 1, 2023

From my side, Volcano can be enhanced to
let user enable or disable the list/watch of CSI, priority class on demand and be more flexable.

+1

@Monokaix Monokaix force-pushed the featuregate branch 4 times, most recently from 05ee33a to 9d4bdd3 Compare August 1, 2023 11:00
@Monokaix
Copy link
Member Author

Monokaix commented Aug 1, 2023

@Monokaix Thanks for your contribution. I think I've been aware of your requirements and ideas in some degree. Can you tell more about the meaning of this fearure? In another word, what will happen without the PR? Unacceptable memory usage or poor performance due to unusable cache?

We hope that Volcano provides a basic framework, which only includes some core resources of k8s. Without this pr, Volcano will not start successfully if apiserver has no advanced capabilities API registered, which are showed in this pr, because it has to wait for the cache synchronization of informer resources.

Some user builds serverless container platform based on Volcano. The infra platform is a simplied kubernetes that have no object like CSI, priority class and so on. The current Volcano does not play with it. From my side, Volcano can be enhanced to let user enable or disable the list/watch of CSI, priority class on demand and be more flexable.

@Monokaix suggest to file a enhancement issue and clarify the scenario and associate it with this pr.

Thanks, enhancement is added.

@Monokaix Monokaix force-pushed the featuregate branch 2 times, most recently from 400a886 to 6150894 Compare August 2, 2023 07:29
@Monokaix
Copy link
Member Author

Monokaix commented Aug 2, 2023

From my side, Volcano can be enhanced to
let user enable or disable the list/watch of CSI, priority class on demand and be more flexable.

+1

@Thor-wl Enhancement has been posted, Could you take a look once more?

@Thor-wl
Copy link
Contributor

Thor-wl commented Aug 3, 2023

@Thor-wl Enhancement has been posted, Could you take a look once more?

OK. Since this PR is a little big large, it may cost some time to reivew.

@@ -60,8 +60,6 @@ type ServerOption struct {
PrintVersion bool
EnableMetrics bool
ListenAddress string
EnablePriorityClass bool
EnableCSIStorage bool
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Let's take code compatibility into consideration.

Copy link
Member Author

@Monokaix Monokaix Aug 14, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks, the params are kept.

@@ -107,8 +105,6 @@ func (s *ServerOption) AddFlags(fs *pflag.FlagSet) {
fs.StringVar(&s.LockObjectNamespace, "lock-object-namespace", defaultLockObjectNamespace, "Define the namespace of the lock object that is used for leader election; it is volcano-system by default")
fs.StringVar(&s.ListenAddress, "listen-address", defaultListenAddress, "The address to listen on for HTTP requests.")
fs.StringVar(&s.HealthzBindAddress, "healthz-address", defaultHealthzAddress, "The address to listen on for the health check server.")
fs.BoolVar(&s.EnablePriorityClass, "priority-class", true,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same here

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks, the params are kept.

@lowang-bh
Copy link
Member

lowang-bh commented Aug 14, 2023

Please provide some important test result snapshot about this feature. Thanks.

@Thor-wl
Copy link
Contributor

Thor-wl commented Aug 15, 2023

/hold cancel

@volcano-sh-bot volcano-sh-bot removed the do-not-merge/hold Indicates that a PR should not merge because someone has issued a /hold command. label Aug 15, 2023
FilterFunc: func(obj interface{}) bool {
switch v := obj.(type) {
case *busv1alpha1.Command:
if v.TargetObject != nil &&
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggest to be simpled as return v.TargetObject != nil && v.TargetObject.APIVersion == batchv1alpha1.SchemeGroupVersion.String() && v.TargetObject.Kind == "Job"

)

const (
// WorkLoadSupport can cache and operate workload resource, Deployment/Replicas/ReplicationController/StatefulSet resources currently.
Copy link
Contributor

@Thor-wl Thor-wl Aug 15, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggest to update the comment to be WorkLoadSupport can cache and operate **K8s native resource**, Deployment/Replicas/ReplicationController/StatefulSet resources currently.. That may be easier to understand.

@Thor-wl
Copy link
Contributor

Thor-wl commented Aug 16, 2023

/lgtm

@volcano-sh-bot volcano-sh-bot added the lgtm Indicates that a PR is ready to be merged. label Aug 16, 2023
@@ -33,6 +33,7 @@ import (
utilruntime "k8s.io/apimachinery/pkg/util/runtime"
"k8s.io/apimachinery/pkg/util/sets"
"k8s.io/apimachinery/pkg/util/wait"
utilfeature "k8s.io/apiserver/pkg/util/feature"
Copy link
Member

@william-wang william-wang Aug 16, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

please change utilfeature to be a more reasonable name

runtime.Must(utilfeature.DefaultMutableFeatureGate.Add(defaultVolcanoFeatureGates))
}

var defaultVolcanoFeatureGates = map[featuregate.Feature]featuregate.FeatureSpec{
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

change defaultVolcanoFeatureGates to defaultFeatureGates

Copy link
Member

@william-wang william-wang left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

/approve

@volcano-sh-bot
Copy link
Contributor

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: william-wang

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@volcano-sh-bot volcano-sh-bot added the approved Indicates a PR has been approved by an approver from all required OWNERS files. label Aug 16, 2023
@volcano-sh-bot volcano-sh-bot merged commit 433e976 into volcano-sh:master Aug 16, 2023
12 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
approved Indicates a PR has been approved by an approver from all required OWNERS files. lgtm Indicates that a PR is ready to be merged. size/L Denotes a PR that changes 100-499 lines, ignoring generated files.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants